25 research outputs found

    BIRCH: A user-oriented, locally-customizable, bioinformatics system

    BACKGROUND: Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. RESULTS: BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step-by-step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. CONCLUSION: BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

    Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532T using RNA-seq transcriptomics and high-throughput proteomics

    BACKGROUND: Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. Clostridium stercorarium DSM8532(T) is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics. RESULTS: A paired-end Roche/454 WGS assembly was closed through application of an in silico algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose. CONCLUSIONS: Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts; however, these preps resulted in WGS enhancement and transcriptome profiling in a single Illumina run. No detection of transaldolase by any method despite xylose utilization by this organism indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-567) contains supplementary material, which is available to authorized users.

    Network computing: restructuring how scientists use computers, and what we get out of them

    This article describes the network computer (NC), an alternative to the standalone PC. By shifting data storage and most processing to the server, any user can do any task from any NC. NCs provide a reliable, consistent interface for all users, and make it easy to provide group access to resources such as laborator

    Sequence data analysis and preprocessing for oligo probe design in microbial genomes

    Good oligo probe design in DNA microarray experiments is crucial for obtaining better gene expression analysis results. However, sequence data from a very large microbial genome or pan-genome will yield a reduced number of oligos and lower the design quality if processed directly by a probe designer. Gene redundancies and discrepancies across resources of the same species or strain, together with their sequence similarity and homology, are responsible for the poor quantity of oligos designed. We addressed these sequence problems and introduced the concept of open reading frame (ORF) sequence segmentation, from which quality oligos can be selected. Analysis and pre-processing of sequence data were performed using our Perl-based pipeline ORF-Purger 2.0. ORFs were purged of redundancy, discrepancy, invalidity, overlapping, similarity and, optionally, homology, such that the quantity and quality of oligos to be designed were drastically improved. Probe integrity was proposed as the first probe selection criterion, since the full physical availability of all possible probes corresponding to their targets in a nucleic acid sample is necessary for the best probe design.

    Genome Sequence Analysis of the Oleaginous Yeast, Rhodotorula diobovata, and Comparison of the Carotenogenic and Oleaginous Pathway Genes and Gene Products with Other Oleaginous Yeasts

    Rhodotorula diobovata is an oleaginous and carotenogenic yeast, useful for diverse biotechnological applications. To understand the molecular basis of its potential applications, the genome was sequenced using the Illumina MiSeq and Ion Torrent platforms, assembled by ABySS, and annotated using the JGI annotation pipeline. The genome size, 21.1 MB, was similar to that of the biotechnological “workhorse”, R. toruloides. Comparative analyses of the R. diobovata genome sequence with those of other Rhodotorula species, Yarrowia lipolytica, Phaffia rhodozyma, Lipomyces starkeyi, and Sporidiobolus salmonicolor, were conducted, with emphasis on the carotenoid and neutral lipid biosynthesis pathways. Amino acid sequence alignments of key enzymes in the lipid biosynthesis pathway revealed why the activity of malic enzyme and ATP-citrate lyase may be ambiguous in Y. lipolytica and L. starkeyi. Phylogenetic analysis showed a close relationship between R. diobovata and R. graminis WP1. Dot-plot analysis of the coding sequences of the genes crtYB and ME1 corroborated sequence homologies between sequences from R. diobovata and R. graminis. There was, however, nonsequential alignment between crtYB CDS sequences from R. diobovata and those from X. dendrorhous. This research presents the first genome analysis of R. diobovata with a focus on its biotechnological potential as a lipid and carotenoid producer.

    Genomic Comparison of Facultatively Anaerobic and Obligatory Aerobic Caldibacillus debilis Strains GB1 and Tf Helps Explain Physiological Differences

    Caldibacillus debilis strains GB1 and Tf display distinct phenotypes. C. debilis GB1 is capable of anaerobic growth and can synthesize ethanol while C. debilis Tf cannot. Comparison of the GB1 and Tf genome sequences revealed that the genomes were highly similar in gene content and showed a high level of synteny. At the genome scale, there were several large sections of DNA that appeared to be from lateral gene transfer into the GB1 genome. Tf did have unique genetic content but at a much smaller scale; 300 genes in Tf versus 857 genes in GB1 that matched at ≤90% sequence similarity. Gene complement and copy number of genes for the glycolysis, tricarboxylic acid (TCA) cycle, and electron transport chain (ETC) pathways were identical in both strains. While Tf is an obligate aerobe, it possesses the gene complement for an anaerobic lifestyle (ldh, ak, pta, adhE, pfl). As a species, other strains of C. debilis should be expected to have the potential for anaerobic growth. Assaying the whole cell lysate for ADH activity revealed an approximately 2-fold increase in the enzymatic activity in GB1 when compared to Tf.